Batching CSIDH Group Actions using AVX-512


Abstract

Commutative Supersingular Isogeny Diffie-Hellman (or CSIDH for short) is a recently proposed post-quantum key establishment scheme that belongs to the family of isogeny-based cryptosystems. The CSIDH protocol is based on the action of an ideal class group on a set of supersingular elliptic curves and comes with some very attractive features, e.g. the ability to serve as a “drop-in” replacement for the standard elliptic curve Diffie-Hellman protocol. Unfortunately, the execution time of CSIDH is prohibitively high for many real-world applications, mainly due to the enormous computational cost of the underlying group action. Consequently, there is a strong demand for optimizations that increase the efficiency of the class group action evaluation, which is important not only for CSIDH, but also for related cryptosystems like the signature schemes CSI-FiSh and SeaSign. In this paper, we explore how the AVX-512 vector extensions (incl. AVX-512F and AVX-512IFMA) can be utilized to optimize the constant-time evaluation of the CSIDH-512 class group action with the goal of, respectively, maximizing throughput and minimizing latency. We introduce different approaches for batching group actions and computing them in SIMD fashion on modern Intel processors. In particular, we present a hybrid batching technique that, when combined with optimized (8 × 1)-way prime-field arithmetic, increases the throughput by a factor of 3.64 compared to a state-of-the-art (non-vectorized) x64 implementation. On the other hand, vectorization in a 2-way fashion, aimed at reducing latency, makes our AVX-512 implementation of the group action evaluation about 1.54 times faster than the state-of-the-art. To the best of our knowledge, this paper is the first to demonstrate the potential of using vector instructions to increase the throughput (resp. decrease the latency) of CSIDH.
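As a rough illustration of what "(8 × 1)-way" prime-field arithmetic means, the sketch below runs eight independent field multiplications side by side, one per 64-bit lane of a 512-bit register. It is a toy model in plain Python, not the paper's implementation: lists stand in for vector registers, the radix-2^52 limb layout is an assumption chosen to match the 52-bit multiplier width of AVX-512IFMA, `P_TOY` is a placeholder modulus (not the CSIDH-512 prime), and all function names are hypothetical.

```python
# Toy model of (8 x 1)-way field multiplication: 8 independent products,
# one per lane. Python lists stand in for 512-bit vector registers.
LANES = 8                 # eight 64-bit lanes in one 512-bit register
LIMB_BITS = 52            # assumed radix 2^52 (AVX-512IFMA multiplies 52-bit operands)
NUM_LIMBS = 10            # 10 * 52 = 520 bits, enough for a 511-bit element
MASK = (1 << LIMB_BITS) - 1
P_TOY = (1 << 511) - 1    # placeholder modulus, NOT the CSIDH-512 prime


def to_limb_sliced(values):
    """Pack 8 integers so that 'vector' j holds limb j of every lane."""
    assert len(values) == LANES
    return [[(v >> (LIMB_BITS * j)) & MASK for v in values]
            for j in range(NUM_LIMBS)]


def from_limb_sliced(vectors):
    """Recombine (possibly unnormalized) limb vectors into one integer per lane."""
    return [sum(vec[lane] << (LIMB_BITS * j) for j, vec in enumerate(vectors))
            for lane in range(LANES)]


def batched_mul(a, b):
    """Schoolbook product: each (i, j) step is one lane-wise multiply-accumulate.

    Real 64-bit lanes cannot hold a 104-bit partial product; hardware code
    would split it into low and high 52-bit halves. Python integers are
    arbitrary precision, so this sketch skips that split.
    """
    acc = [[0] * LANES for _ in range(2 * NUM_LIMBS - 1)]
    for i in range(NUM_LIMBS):
        for j in range(NUM_LIMBS):
            for lane in range(LANES):   # conceptually a single vector instruction
                acc[i + j][lane] += a[i][lane] * b[j][lane]
    return acc


def batched_mul_mod(xs, ys, p=P_TOY):
    """Multiply 8 pairs of field elements in limb-sliced form, then reduce each lane."""
    product = batched_mul(to_limb_sliced(xs), to_limb_sliced(ys))
    return [v % p for v in from_limb_sliced(product)]
```

The point of the layout is that the inner loop over lanes carries no dependency between lanes, so eight group-action evaluations can advance in lockstep; this favors throughput rather than the latency of any single evaluation.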


Similar resources

Fast Sorting Algorithms using AVX-512 on Intel Knights Landing

The modern CPU’s design, which is composed of hierarchical memory and SIMD/vectorization capability, governs the potential for algorithms to be transformed into efficient implementations. The release of the AVX-512 changed things radically, and motivated us to search for an efficient sorting algorithm that can take advantage of it. In this paper, we describe the best strategy we have found, whi...


A Novel Hybrid Quicksort Algorithm Vectorized using AVX-512 on Intel Skylake

The modern CPU’s design, which is composed of hierarchical memory and SIMD/vectorization capability, governs the potential for algorithms to be transformed into efficient implementations. The release of the AVX-512 changed things radically, and motivated us to search for an efficient sorting algorithm that can take advantage of it. In this paper, we describe the best strategy we have found, whi...


Computing the Sparse Matrix Vector Product using Block-Based Kernels Without Zero Padding on Processors with AVX-512 Instructions

The sparse matrix-vector product (SpMV) is a fundamental operation in many scientific applications from various fields. The High Performance Computing (HPC) community has therefore continuously invested a lot of effort to provide an efficient SpMV kernel on modern CPU architectures. It has been shown that block-based kernels are helpful to achieve high performance, but also that they are diffic...


ELZAR: Triple Modular Redundancy using Intel AVX

Instruction-Level Redundancy (ILR) is a well known approach to tolerate transient CPU faults. It replicates instructions in a program and inserts periodic checks to detect and correct CPU faults using majority voting, which essentially requires three copies of each instruction and leads to high performance overheads. As SIMD technology can operate simultaneously on several copies of the data, i...



Journal

Journal title: IACR Transactions on Cryptographic Hardware and Embedded Systems

Year: 2021

ISSN: 2569-2925

DOI: https://doi.org/10.46586/tches.v2021.i4.618-649